Robot Planning & Action
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Robot Talk Episode 142 – Collaborative robot arms, with Mark Gray
Mark Gray has worked in automation for the last 30 years, first in machine vision and robotics and finally in collaborative robots, or cobots. As country manager, Mark was the first person to work for Universal Robots in the UK and has carried out projects with many research institutes such as the Advanced Manufacturing Research Centre (AMRC), the Manufacturing Technology Centre (MTC), the National Robotarium, and Bristol Robotics Lab. Robot Talk is a weekly podcast that explores the exciting world of robotics, artificial intelligence and autonomous machines. In the latest episode of the Robot Talk podcast, Claire chatted to Razanne Abu-Aisheh from the University of Bristol about how people feel about interacting with robot swarms.
- Europe > United Kingdom (0.26)
- North America > United States > Texas (0.08)
Regret bounds for meta Bayesian optimization with an unknown Gaussian process prior
Bayesian optimization usually assumes that a Bayesian prior is given. However, the strong theoretical guarantees in Bayesian optimization are often regrettably compromised in practice because of unknown parameters in the prior. In this paper, we adopt a variant of empirical Bayes and show that, by estimating the Gaussian process prior from offline data sampled from the same prior and constructing unbiased estimators of the posterior, variants of both GP-UCB and probability of improvement achieve a near-zero regret bound, which decreases to a constant proportional to the observational noise as the amount of offline data and the number of online evaluations increase. Empirically, we have verified our approach on challenging simulated robotic problems featuring task and motion planning.
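As a rough illustration of the empirical-Bayes idea in the abstract above, the sketch below estimates a prior mean and covariance on a small finite candidate set from offline function samples, then ranks candidates by an upper-confidence-bound score. It is not the paper's estimator: the function names, the toy data, and the `beta` value are hypothetical, and the score uses only the estimated prior, before any online evaluations.

```python
import math

def estimate_prior(offline):
    """Estimate a prior mean vector and covariance matrix on a finite input set.

    `offline` is a list of M rows; each row holds one offline function's
    values at the same N candidate inputs (an assumption of this sketch).
    """
    m = len(offline)
    n = len(offline[0])
    mean = [sum(row[j] for row in offline) / m for j in range(n)]
    # Unbiased sample covariance (divide by M - 1).
    cov = [
        [
            sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in offline) / (m - 1)
            for j in range(n)
        ]
        for i in range(n)
    ]
    return mean, cov

def ucb_scores(mean, cov, beta):
    """Upper confidence bound per candidate: mean + beta * standard deviation."""
    return [mean[j] + beta * math.sqrt(max(cov[j][j], 0.0)) for j in range(len(mean))]

# Toy offline data: three functions observed at three candidate inputs.
offline = [
    [0.0, 1.0, 2.0],
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 4.0],
]
mean, cov = estimate_prior(offline)
scores = ucb_scores(mean, cov, beta=2.0)  # beta is a hypothetical exploration weight
best = max(range(len(scores)), key=scores.__getitem__)
```

Here the third candidate wins because it combines the highest estimated mean with the largest estimated variance. A full method would also condition on online observations before scoring.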
Latent Planning via Expansive Tree Search
Planning enables autonomous agents to solve complex decision-making problems by evaluating predictions of the future. However, classical planning algorithms often become infeasible in real-world settings where state spaces are high-dimensional and transition dynamics are unknown. The idea behind latent planning is to simplify the decision-making task by mapping it to a lower-dimensional embedding space. Common latent planning strategies are based on trajectory optimization techniques such as shooting or collocation, which are prone to failure in long-horizon and highly non-convex settings. In this work, we study long-horizon goal-reaching scenarios from visual inputs and formulate latent planning as an explorative tree search. Inspired by classical sampling-based motion planning algorithms, we design a method which iteratively grows and optimizes a tree representation of visited areas of the latent space. To encourage fast exploration, the sampling of new states is biased towards sparsely represented regions within the estimated data support. Our method, called Expansive Latent Space Trees (ELAST), relies on self-supervised training via contrastive learning to obtain (a) a latent state representation and (b) a latent transition density model. We embed ELAST into a model-predictive control scheme and demonstrate significant performance improvements compared to existing baselines on challenging visual control tasks in simulation, including the navigation of a deformable object.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.60)
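The ELAST abstract above says that sampling is biased towards sparsely represented regions. One common way to get that bias, used in classical expansive-space-tree planners, is to pick a tree node for expansion with probability inversely related to its local density. The sketch below assumes that weighting and works on raw 2D points instead of learned latent states; the function name, `radius`, and the toy nodes are hypothetical and are not taken from the paper.

```python
import math
import random

def expansion_probabilities(nodes, radius):
    """Selection probability per tree node, inversely related to local density.

    Density is approximated by counting the other nodes within `radius`.
    """
    weights = []
    for i, a in enumerate(nodes):
        neighbors = sum(
            1 for j, b in enumerate(nodes) if j != i and math.dist(a, b) <= radius
        )
        weights.append(1.0 / (1 + neighbors))
    total = sum(weights)
    return [w / total for w in weights]

# Three clustered nodes and one isolated node.
nodes = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0)]
probs = expansion_probabilities(nodes, radius=0.5)

# Draw the next node to expand; the isolated node is three times as likely
# to be chosen as any single clustered node.
rng = random.Random(0)
chosen = rng.choices(nodes, weights=probs, k=1)[0]
```

In a learned latent space the neighbor count would be replaced by whatever density estimate the model provides.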
Scene-agnostic Hierarchical Bimanual Task Planning via Visual Affordance Reasoning
Lee, Kwang Bin, Kang, Jiho, Lee, Sung-Hee
Embodied agents operating in open environments must translate high-level instructions into grounded, executable behaviors, often requiring coordinated use of both hands. While recent foundation models offer strong semantic reasoning, existing robotic task planners remain predominantly unimanual and fail to address the spatial, geometric, and coordination challenges inherent to bimanual manipulation in scene-agnostic settings. We present a unified framework for scene-agnostic bimanual task planning that bridges high-level reasoning with 3D-grounded two-handed execution. Our approach integrates three key modules. Visual Point Grounding (VPG) analyzes a single scene image to detect relevant objects and generate world-aligned interaction points. Bimanual Subgoal Planner (BSP) reasons over spatial adjacency and cross-object accessibility to produce compact, motion-neutralized subgoals that exploit opportunities for coordinated two-handed actions. Interaction-Point-Driven Bimanual Prompting (IPBP) binds these subgoals to a structured skill library, instantiating synchronized unimanual or bimanual action sequences that satisfy hand-state and affordance constraints. Together, these modules enable agents to plan semantically meaningful, physically feasible, and parallelizable two-handed behaviors in cluttered, previously unseen scenes. Experiments show that the framework produces coherent, feasible, and compact two-handed plans and generalizes to cluttered scenes without retraining, demonstrating robust scene-agnostic affordance reasoning for bimanual tasks.
- Workflow (0.67)
- Research Report (0.41)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.86)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.68)
High-Performance Dual-Arm Task and Motion Planning for Tabletop Rearrangement
Zhang, Duo, Huang, Junshan, Yu, Jingjin
We propose Synchronous Dual-Arm Rearrangement Planner (SDAR), a task and motion planning (TAMP) framework for tabletop rearrangement, where two robot arms equipped with 2-finger grippers must work together in close proximity to rearrange objects whose start and goal configurations are strongly entangled. To tackle such challenges, SDAR tightly knits together its dependency-driven task planner (SDAR-T) and synchronous dual-arm motion planner (SDAR-M) to intelligently sift through a large number of possible task and motion plans. Specifically, SDAR-T applies a simple yet effective strategy to decompose the global object dependency graph induced by the rearrangement task, producing better dual-arm task plans than solutions derived from optimal task plans for a single arm. Leveraging state-of-the-art GPU SIMD-based motion planning tools, SDAR-M employs a layered motion planning strategy to sift through many task plans for the best synchronous dual-arm motion plan while ensuring a high success rate. Comprehensive evaluation demonstrates that SDAR delivers a 100% success rate in solving complex, non-monotone, long-horizon tabletop rearrangement tasks with solution quality far exceeding the previous state-of-the-art. Experiments on two UR-5e arms further confirm that SDAR directly and reliably transfers to robot hardware. Task and motion planning (TAMP) [1] represents a fundamental computational challenge in robotics, in which a robot system, e.g., one or more robot arms, must break down a given, potentially long-horizon task into suitable "bite-sized" sub-tasks that can be executed through short-horizon robot motions.
- North America > United States > New Jersey > Middlesex County > Piscataway (0.14)
- Europe > Germany > Berlin (0.04)
- Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)
db-LaCAM: Fast and Scalable Multi-Robot Kinodynamic Motion Planning with Discontinuity-Bounded Search and Lightweight MAPF
Moldagalieva, Akmaral, Okumura, Keisuke, Prorok, Amanda, Hönig, Wolfgang
State-of-the-art multi-robot kinodynamic motion planners struggle to handle more than a few robots because of their high computational burden, which limits scalability and results in slow planning times. In this work, we combine the scalability and speed of modern multi-agent path finding (MAPF) algorithms with the dynamics-awareness of kinodynamic planners to address these limitations. To this end, we propose discontinuity-Bounded LaCAM (db-LaCAM), a planner that utilizes a precomputed set of motion primitives that respect robot dynamics to generate horizon-length motion sequences, while allowing a user-defined discontinuity between successive motions. The planner db-LaCAM is resolution-complete with respect to motion primitives and supports arbitrary robot dynamics. Extensive experiments demonstrate that db-LaCAM scales efficiently to scenarios with up to 50 robots, achieving up to ten times faster runtime compared to state-of-the-art planners, while maintaining comparable solution quality. The approach is validated in both 2D and 3D environments with dynamics such as the unicycle and 3D double integrator. We demonstrate the safe execution of trajectories planned with db-LaCAM in two distinct physical experiments involving teams of flying robots and car-with-trailer robots.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Germany > Berlin (0.04)
- Asia > Japan (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.84)
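The db-LaCAM abstract above mentions a user-defined discontinuity between successive motions. A minimal sketch of that bound, assuming each motion primitive is reduced to a start and end state and the gap is measured as Euclidean distance, could look like the following. The dictionary layout, the function names, and the metric are assumptions of this example, not the planner's actual data structures.

```python
import math

def max_discontinuity(primitives):
    """Largest gap between one primitive's end state and the next one's start state."""
    return max(
        (math.dist(a["end"], b["start"]) for a, b in zip(primitives, primitives[1:])),
        default=0.0,
    )

def is_delta_bounded(primitives, delta):
    """True if every consecutive pair of primitives connects within `delta`."""
    return max_discontinuity(primitives) <= delta

# Toy sequence of three primitives with small jumps between them.
plan = [
    {"start": (0.0, 0.0), "end": (1.0, 0.0)},
    {"start": (1.0, 0.05), "end": (2.0, 0.0)},
    {"start": (2.0, 0.3), "end": (3.0, 0.0)},
]
gap = max_discontinuity(plan)          # about 0.3, from the second jump
tight = is_delta_bounded(plan, 0.1)    # False: the second jump exceeds 0.1
loose = is_delta_bounded(plan, 0.5)    # True: both jumps are within 0.5
```

A larger `delta` admits more primitive sequences, at the cost of larger jumps that a later stage would have to repair or tolerate.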
OptMap: Geometric Map Distillation via Submodular Maximization
Thorne, David, Chan, Nathan, Robison, Christa S., Osteen, Philip R., Lopez, Brett T.
Autonomous robots rely on geometric maps to inform a diverse set of perception and decision-making algorithms. As autonomy requires reasoning and planning on multiple scales of the environment, each algorithm may require a different map for optimal performance. Light Detection And Ranging (LiDAR) sensors generate an abundance of geometric data to satisfy these diverse requirements, but selecting informative, size-constrained maps is computationally challenging as it requires solving an NP-hard combinatorial optimization. In this work we present OptMap: a geometric map distillation algorithm which achieves real-time, application-specific map generation via multiple theoretical and algorithmic innovations. A central feature is the maximization of set functions that exhibit diminishing returns, i.e., submodularity, using polynomial-time algorithms with provably near-optimal solutions. We formulate a novel submodular reward function which quantifies informativeness, reduces input set sizes, and minimizes bias in sequentially collected datasets. Further, we propose a dynamically reordered streaming submodular algorithm which improves empirical solution quality and addresses input order bias via an online approximation of the value of all scans. Testing was conducted on open-source and custom datasets with an emphasis on long-duration mapping sessions, highlighting OptMap's minimal computation requirements. Open-source ROS1 and ROS2 packages are available and can be used alongside any LiDAR SLAM algorithm.
Modern autonomous systems use a modular software architecture with separate algorithms for perceiving the environment, planning collision-free paths, estimating vehicle motion, and making higher-level decisions to complete their tasks. Many of these algorithms depend on geometric information about the environment to function properly. As a result, their performance and processing time can vary greatly depending on the quality of the geometric data.
For example, trajectory planners use geometric maps to plan collision-free paths, but the density of geometric data is critical for balancing real-time replanning requirements against reliable collision detection. This trade-off is best served by dense geometric maps that specifically capture the intended trajectory corridor (Figure 1a). In contrast, localization entails aligning a source and reference point cloud, a process best served by using a sparse and global reference point cloud to minimize computation time while maximizing alignment accuracy (Figure 1b).
Distribution Statement A: Approved for public release; distribution is unlimited.
Figure 1 (partial caption): The map is dense while remaining efficient, as only points near the intended trajectory are returned.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Oceania > New Zealand > South Island > Marlborough District > Blenheim (0.04)
- North America > United States > Maryland > Prince George's County > Adelphi (0.04)
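The OptMap abstract above relies on maximizing set functions with diminishing returns. As a generic illustration, the sketch below runs the classic greedy algorithm on a coverage objective under a cardinality limit. This is not OptMap's streaming algorithm or its reward function: the scan IDs, the "cells covered" sets, and the function name are hypothetical. For monotone submodular objectives such as coverage, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimum.

```python
def greedy_max_coverage(scans, k):
    """Pick up to k scans that greedily maximize the number of covered cells.

    `scans` maps a scan ID to the set of map cells it observes.
    """
    selected = []
    covered = set()
    for _ in range(k):
        best_id, best_gain = None, 0
        for scan_id in sorted(scans):  # sorted for deterministic tie-breaking
            if scan_id in selected:
                continue
            # Marginal gain: cells this scan adds beyond what is already covered.
            gain = len(scans[scan_id] - covered)
            if gain > best_gain:
                best_id, best_gain = scan_id, gain
        if best_id is None:  # no remaining scan adds anything new
            break
        selected.append(best_id)
        covered |= scans[best_id]
    return selected, covered

# Toy scans, each observing a few numbered cells.
scans = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 2},
}
selected, covered = greedy_max_coverage(scans, k=2)
```

Scan "b" looks larger than "c" in isolation, but after "a" is chosen its marginal gain drops to one cell, so the greedy rule takes "c" instead. That shrinking gain is the diminishing-returns property.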
Building Gradient by Gradient: Decentralised Energy Functions for Bimanual Robot Assembly
Mitchell, Alexander L., Watson, Joe, Posner, Ingmar
There are many challenges in bimanual assembly, including high-level sequencing, multi-robot coordination, and low-level, contact-rich operations such as component mating. Task and motion planning (TAMP) methods, while effective in this domain, may be prohibitively slow to converge when adapting to disturbances that require new task sequencing and optimisation. These events are common during tight-tolerance assembly, where difficult-to-model dynamics such as friction or deformation require rapid replanning and reattempts. Moreover, defining explicit task sequences for assembly can be cumbersome, limiting flexibility when task replanning is required. To simplify this planning, we introduce BGBG, a decentralised gradient-based framework that uses a piecewise continuous energy function through the automatic composition of adaptive potential functions. This approach generates sub-goals using only myopic optimisation, rather than long-horizon planning. It demonstrates effectiveness at solving long-horizon tasks due to the structure and adaptivity of the energy function. We show that our approach scales to physical bimanual assembly tasks for constructing tight-tolerance assemblies. In these experiments, we discover that our gradient-based rapid replanning framework generates automatic retries, coordinated motions and autonomous handovers in an emergent fashion. Bimanual assembly is an inherently sequential planning problem that demands reasoning over tasks and motions. The challenge is further amplified in contact-rich settings or when collaborating with humans, making efficient and robust planning essential for reliable execution.
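To make the idea of myopic, gradient-based sub-goal generation in the BGBG abstract above more concrete, here is a toy sketch. It descends the gradient of a quadratic potential attached to the current sub-goal and switches to the next potential once that sub-goal is reached, which gives a piecewise energy. This is an assumption-laden illustration only: the quadratic potentials, the switching rule, and all names and parameters are hypothetical and do not reproduce the paper's energy function.

```python
import math

def descend_piecewise_energy(pos, goals, step=0.5, tol=1e-3, max_iters=200):
    """Follow the gradient of the active potential 0.5 * ||pos - goal||^2.

    When the active goal is within `tol`, the next potential becomes active.
    Returns the final position and the indices of the goals reached, in order.
    """
    active = 0
    reached = []
    for _ in range(max_iters):
        if active == len(goals):
            break
        goal = goals[active]
        grad = [p - g for p, g in zip(pos, goal)]  # gradient of the quadratic
        if math.sqrt(sum(d * d for d in grad)) <= tol:
            reached.append(active)
            active += 1  # switch to the next potential
            continue
        # Myopic step: only the currently active potential is considered.
        pos = [p - step * d for p, d in zip(pos, grad)]
    return pos, reached

# Two sub-goals in the plane: move right first, then up.
final_pos, reached = descend_piecewise_energy([0.0, 0.0], [(1.0, 0.0), (1.0, 1.0)])
```

No explicit sequence of motions is planned here; the ordering emerges from which potential is active. A disturbance that pushes the state away would simply be corrected by further gradient steps.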
MIND-V: Hierarchical Video Generation for Long-Horizon Robotic Manipulation with RL-based Physical Alignment
Zhang, Ruicheng, Zhang, Mingyang, Zhou, Jun, Guo, Zhangrui, Liu, Xiaofan, Xu, Zunnan, Zhong, Zhizhou, Yan, Puxin, Luo, Haocheng, Li, Xiu
Embodied imitation learning is constrained by the scarcity of diverse, long-horizon robotic manipulation data. Existing video generation models for this domain are limited to synthesizing short clips of simple actions and often rely on manually defined trajectories. To this end, we introduce MIND-V, a hierarchical framework designed to synthesize physically plausible and logically coherent videos of long-horizon robotic manipulation. Inspired by cognitive science, MIND-V bridges high-level reasoning with pixel-level synthesis through three core components: a Semantic Reasoning Hub (SRH) that leverages a pre-trained vision-language model for task planning; a Behavioral Semantic Bridge (BSB) that translates abstract instructions into domain-invariant representations; and a Motor Video Generator (MVG) for conditional video rendering. MIND-V employs Staged Visual Future Rollouts, a test-time optimization strategy to enhance long-horizon robustness. To align the generated videos with physical laws, we introduce a GRPO reinforcement learning post-training phase guided by a novel Physical Foresight Coherence (PFC) reward. PFC leverages the V-JEPA world model to enforce physical plausibility by aligning the predicted and actual dynamic evolutions in the feature space. MIND-V demonstrates state-of-the-art performance in long-horizon robotic manipulation video generation, establishing a scalable and controllable paradigm for embodied data synthesis.
- North America > United States (0.04)
- Asia > Middle East > Saudi Arabia > Asir Province > Abha (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.35)
- Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.34)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.34)